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Abstract 



In this paper, we introduce a new, local formulation of the ensemble Kalman Filter approach 
for atmospheric data assimilation. Our scheme is based on the hypothesis that, when the 
Earth's surface is divided up into local regions of moderate size, vectors of the forecast uncer- 
tainties in such regions tend to lie in a subspace of much lower dimension than that of the full 
atmospheric state vector of such a region. Ensemble Kalman Filters, in general, assume that 
the analysis resulting from the data assimilation lies in the same subspace as the expected 
forecast error. Under our hypothesis the dimension of this subspace is low. This implies that 
operations only on relatively low dimensional matrices are required. Thus, the data analysis 
is done locally in a manner allowing massively parallel computation to be exploited. The 
local analyses are then used to construct global states for advancement to the next forecast 
time. The method, its potential advantages, properties, and implementation requirements 
are illustrated by numerical experiments on the Lorenz-96 model. It is found that accurate 
analysis can be achieved at a cost which is very modest compared to that of a full global 
ensemble Kalman Filter. 



1 Introduction 



The purpose of this paper is to develop and test a new atmospheric data assimilation scheme, 
which we call the Local Ensemble Kalman Filter method. Atmospheric data assimilation 
{analysis) is the process through which an estimate of the atmospheric state is obtained by 
using observed data and a dynamical model of the atmosphere (e.g., Daley 1991; Kalnay 
2002). These estimates, called analyses, can then be used as initial conditions in operational 
numerical weather predictions. In addition, diagnostic studies of atmospheric dynamics and 
climate are also often based on analyses instead of raw observed data. 

The analysis at a given time instant is an approximation to a maximum likelihood estimate 
of the atmospheric state in which a short-term forecast, usually referred to as the background 
or first guess field, is used as a prior estimate of the atmospheric state (Lorenc 1986). Then 
the observations are assimilated into the background field by a statistical interpolation. This 
interpolation is performed based on the assumptions (i) that the uncertainties in the back- 
ground field and the observations are unbiased and normally distributed, (ii) that there are 
no cross correlations between background and observations errors, and (Hi) that the covari- 
ance between different components of the background (formally the the background covariance 
matrix) and the covariances between uncertainties in the noisy observations (formally the ob- 
servational error covariance matrix) are known. In reality, however, the background error 
covariance matrix cannot be directly computed. The implementation of a data assimilation 
system, therefore, requires the development of statistical models that can provide an esti- 
mate of the background error covariance matrix. The quality of a data assimilation system is 
primarily determined by the accuracy of this estimate. 

In the case of linear dynamics the mathematically consistent technique to define an adap- 
tive background covariance matrix is the Kalman Filter (Kalman 1960; Kalman and Bucy 
1961) which utilizes the dynamical equations to evolve the most probable state and the error 
covariance matrix in time. In the case of linear systems with unbiased normally distributed 
errors the Kalman Filter provides estimates of the system state that are optimal in the mean 
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square sense. The method has also been adapted to nonhnear systems, but, in this case, 
optimahty no longer applies. Although the Kalman Filter approach has been successfully 
implemented for a wide range of applications and has been considered for atmospheric data 
assimilation for a long while (Jones 1965; Petersen 1973; Ghil et al. 1981, Dee et al., 1985), 
the computational cost involved does not allow for an operational implementation in the 
foreseeable future (see Daley 1991 for details). 

The currently most popular approach to reduce the cost of the Kalman Filter is to use a 
relatively small (10-100 member) ensemble of background forecasts to estimate the background 
error covariances (e.g. Evensen 1994; Houtekamer and Mitchell 1998, 2001; Bishop et al. 2001; 
Hamill et al. 2001; Whitaker and Hamill 2002; Keppenne and Rienecker 2002). In ensemble- 
based data assimilation schemes the ensemble of background forecasts is generated by using 
initial conditions distributed according to the result of the previous analysis. [In this paper 
we will not consider the issue of model error. How to appropriately incorporate model error 
in an ensemble Kalman filter, especially when accounting for the fact that such errors are 
temporally correlated, is still an open question.] 

The ensemble-based approach has the appeal of providing initial ensemble perturbations 
that are consistent with the analysis scheme. This is important because currently imple- 
mented operational techniques generate initial ensemble perturbations without use of direct 
information about the analysis errors (Toth and Kalnay 1997; Molteni et al. 1996). These 
techniques are obviously suboptimal considering the goal of ensemble forecasting, which is to 
simulate the effect of the analysis uncertainties on the ensuing forecasts. 

The main difference between the existing ensemble-based schemes is in the generation of 
the analysis ensemble. One family of schemes is based on perturbed observations (Evensen 
and van Leeuwen 1996; Houtekamer and Mitchell 1998, 2001; Hamill and Snyder 2000, 2001, 
Keppenne and Rienecker 2002). In this approach, the analysis ensemble is obtained by as- 
similating a different set of observations to each member of the background ensemble. The 
different sets of observations are created by adding random noise to the real observations. 
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where the random noise component is generated according to the observational error covari- 
ance matrix. The main weakness of this approach is that the ensemble size must be large 
in order to accurately represent the probability distribution of the background errors. Thus 
a relatively large forecast ensemble has to be evolved in time, limiting the efficiency of the 
approach. Recent papers have discussed how the required size of the ensemble can be reduced 
(Houtekamer and Mitchell 1998 and 2001; Hamill et al. 2001), e.g., by filtering the long 
distance covariances from the background field. 

The other family of schemes, the Kalman square-root filters, use a different approach to 
reduce the size of the ensemble. These techniques do the analysis only once, to obtain the 
mean analysis. Then the analysis ensemble perturbations (to the mean analysis) are generated 
by linearly transforming the background ensemble perturbations to a set of vectors that can 
be used to represent the analysis error covariance matrix. Thus, the analysis is confined to 
the subspace of the ensemble. This type of Kalman square-root strategy is feasible because 
the analysis error covariance matrix can be computed explicitly from the background and 
observational error covariance matrices. Since there is an infinite set of analysis perturbations 
that can be used to represent the analysis error covariance matrix, many different schemes 
can be derived following this approach (Tippett et al., 2002). Existing examples of the square 
root filter approach are the Ensemble Transform Kalman Filter (Bishop et al. 2001), the 
Ensemble Adjustment Filter (Anderson 2001), and the Ensemble Square-root Filter (Whitaker 
and Hamill 2001). 

The scheme we propose is a Kalman square- root filter ^. The most important difference 

between our scheme and the other Kalman square-root filters is that our analysis is done 

locally in model space. In a sense, our paper is related to previous work that attempted to 

construct a simplified Kalman Filter by explicitly taking into account the dominant unstable 

directions of the state space (Kalnay and Toth 1994; Fisher 1998). 

^The basic algorithm was first described in tlie paper Ott, E., B. H. Hunt, I. Szunyogh, M. Corazza, E. 
Kalnay, D. J. Patil, J. A. Yorke, A. V. Zimin, and E. Kostelich, 2002: "Exploiting local low dimensionality of 



the atmospheric dynamics for efficient Kalman filtering" |http://arxiv.org/abs/physics/0203058). 
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Our scheme is based on the construction of local regions about each grid point. The basic 
idea is that we do the analysis at each grid point using the state variables and observations in 
the local region centered at that point. The effect is similar to using a covariance localization 
filter (e.g., Hamill et al. 2001) whose characteristic length is roughly the size of our local 
regions. An outline of the scheme is as follows. 

1. Advance the analysis ensemble of global atmospheric states to the next analysis time, 
thus obtaining a new background ensemble of global atmospheric states. 

2. For each local region and each member of the background ensemble, form vectors of the 
atmospheric state information in that local region. (Section 2) 

3. In each local region, project the 'local' vectors, obtained in step 2, onto a low dimensional 
subspace that best represents the ensemble in that region. (Section 2) 

4. Do the data assimilation in each of the local low dimensional subspaces, obtaining 
analyses in each local region. (Section 3) 

5. Use the local analyses, obtained in step 4, to form a new global analysis ensemble. (This 
is where the square root filter comes in.) (Section 4) 

6. Go back to step 1. 

These steps are summarized, along with a key of important symbols that we use, in Figure Q 
and its caption. 

This method is potentially advantageous in that the individual local analyses are done in 
low dimensional subspaces, so that matrix operations involve only relatively low dimensional 
matrices. Furthermore, since the individual analyses in different local regions do not interact, 
they can be done independently in parallel. 

In the following sections we describe and test our new approach to data assimilation. 
Section 2 introduces the concept of local regions and explains how the dimension of the local 
state vector can be further reduced. Section 3 explains the analysis scheme for the local 
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regions. In section 4, the local analyses are pieced together to obtain the global analysis field 
and the ensemble of global analysis perturbations. Section 5 illustrates our data assimilation 
scheme by an application to a toy spatio-temporally chaotic model system introduced by 
Lorenz (1996). 

2 Local vectors and their covariance 

A model state of the atmosphere is given by a vector field x(r, t) where r is two dimensional 
and runs over discrete values Tjnn (the grid in the physical space used in the numerical compu- 
tations) . Typically, the two components of r are the geographical longitude and latitude, and 
X at a fixed r is a vector of all relevant physical state variables of the model (e.g., wind velocity 
components, temperature, surface pressure, humidity, etc., at all height levels included in the 
model). Let u denote the dimensionality of x(r, t) (at fixed r); e.g., when five independent 
state variables are defined at 28 vertical levels, u = 140. 

Data assimilation schemes generally treat x(r, i) as a random variable with a time-dependent 
probability distribution. The characterization of x is updated over time in two ways: (i) it 
is evolved according to the model dynamics; and (ii) it is modified periodically to take into 
account recent atmospheric observations. 

We do our analysis locally in model space. In this section we introduce our local coordinate 
system and the approximations we make to the local probability distribution of x(r, t) . Since 
all the analysis operations take place at a fixed time t, we will suppress the t dependence of 
all vectors and matrices introduced henceforth. 

Motivated by the work of Patil et al. (2001) we introduce at each point local vectors x^„ 
of the information :x.{rm+m',n+n',t) for — / < m',n' < I. That is, x^^ specifies the model 
atmospheric state within a (2/ -|- 1) by {21 + 1) patch of grid points centered at r^^. (This 
particular shape of the local region was chosen to keep the notations as simple as possible, 
but different (e.g., circular) shape regions and localization in the vertical direction can also 
be considered.) The dimensionality of x^^j is {21 + 1)^m. We represent the construction of 
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local vectors via a linear operator M^n; 

= M„„x(r,t). (1) 

We now consider local vectors obtained from the model as forecasts, using initial conditions 
distributed according to the result of the previous analysis, and we denote these by x,|^„ 
(where the superscript b stands for "background"). Let Fmni'^mn) be our approximation 
to the probability density function for xj^,„ at the current analysis time t. A fundamental 
assumption is that this probability distribution can be usefully approximated as Gaussian, 

where P|^„ and x^„ are the local background error covariance matrix and most probable state 
associated with Fmn(x^„). Graphically, the level set 

is an ellipsoid as illustrated in Figure |21 The equation of this probability ellipsoid is 

\ mn mnl \ mnl \ mn mnl \ I 



We emphasize that the Gaussian form for the background probability distribution, Fmn(x^„), 
is rigorously justifiable only for a linear system, but not for a nonlinear system such as the 
atmosphere. 

As explained subsequently, the rank of the {21 + l)^-u by (2/ + l)^-u covariance matrix Pj^^ 
for our approximate probability distribution function Fmn is much less than {21 + Vfu. Let 

A; = rank(PL); (5) 

(fc = 2 in Figure|2I). Thus P^^ has a (2/ + l^u — k dimensional null space E>mn and the inverse 
(P^„)~^ is defined for the component of the vectors (x^^ — x^„) lying in the k dimensional 
subspace S^n orthogonal to S^n 

In the data assimilation procedure we describe in this paper, the background error co- 
variance matrix Pj^„ and the most probable background state x|^„ are derived from a k' + 1 
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member ensemble of global state field vectors {x''*^*^(r, t)}, i = 1,2, ■ ■ ■ , k' + 1; k' > k > 1. 
The most probable state is given by 

2=1 

To obtain the local background error covariance matrix that we use in our analysis, we 
first consider a matrix given by 

fc'+i 

1=1 

where the superscribed T denotes transpose, and 

<5x^2 = M^„x''«(r,t)-xL(r,t). (8) 
It is also useful to introduce the notation 

in terms of which (|7j) can be rewritten, 

mn mn mn' V ^/ 

We assume that forecast uncertainties in the mid-latitude extra-tropics tend to lie in a low 
dimensional subset of the (2/ + dimensional local vector space ^. Thus we anticipate 
that we can approximate the background error covariance matrix by one of much lower rank 
than {21 + and this motivates our assumption that an ensemble of size of k' + 1, where 
A;' + 1 is substantially less than {21 + will be sufficient to yield a good approximate 

representation of the background covariance matrix. Typically, P^^ has rank k', i.e., it has 
k' positive eigenvalues. Let the eigenvalues of the matrix P|^^ be denoted by Ami, where the 
labeling convention for the index j is 

A« > A(2) > ... > A^'^) > ■■■ > A(^'\ (11) 

^mn — 'mn — — 'mn — — 'mn \ J 

^Preliminary results with an implementation of our data assimilation scheme on the NCEP GFS supports 
this view. 
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Since P^n is a symmetric matrix, it has k' orthonormal eigenvectors {umn} corresponding to 
the k' eigenvalues (fTTj). Thus 

k' 

P^' = V A^-'') u^-'') fu^^'h^ (12) 

mn / j ^mn mn\ mn) ' \ ) 

i=i 

Since the size of the ensemble is envisioned to be much less than the dimension of x^„, 
[k! + 1) ^ (2/ + the computation of the eigenvalues and eigenvectors of P^„ is most 
effectively done in the basis of the ensemble vectors. That is, we consider the eigenvalue 
problem for the (/c' + 1) x (A;' + 1) matrix X|^„X^„, whose nonzero eigenvalues are those of 
P-mn [HI whose corresponding eigenvectors left-multiplied by Xj^„ are the k' eigenvectors 
Umn of Pm„. We approximate P|^„ by truncating the sum at k < k' 

k 

mn / J '^mn mn\ mn) " V / 

In terms of Umn and Amn, the principal axes of the probability ellipsoid (Figure 121) are given 
by 

\[y^nvS^^. (14) 

The basic justification for the approximation of the covariance by Pj^„ is our supposition that 



for reasonably small values of A;, the error variance in all other directions is much less than 
the variance, 

k 

EaHL (15) 
i=i 

in the directions {umn}, j = 1, 2, . . . , fc. The truncated covariance matrix P|^„ is determined 
not only by the dynamics of the model but also by the choice of the components of (5xmn. 
In order to meaningfully compare eigenvalues. Equation ()11|). the different components of 
5xmn (e.g., wind and temperature) should be properly scaled to ensure that, if the variance 
(fT3j) approximates the full variance, then the first k eigendirections, {umn}, j = 1, 2, . . . , fc, 
explain the important uncertainties in the background, x^„. For instance, the weights for the 
different variables can chosen so that the Euclidean norm of the transformed vectors is equal 



to their total energy norm derived in Talagrand (1981). In what follows, we assume that the 
vector components are already properly scaled. (We also note that ii k = k', the comparison 
of eigenvalues is not used and thus such a consistent scaling of the variables is not necessary.) 

For the purpose of subsequent computation, we consider the coordinate system for the 
k dimensional space determined by the basis vectors {u^n}- We call this the internal 
coordinate system for E>mn- To change between the internal coordinates and those of the local 
space, we introduce the {21 + l^u by k matrix, 

O = lu^^Mu^^) I • • • lu^'')]- (16) 

We denote the projection of vectors into S^n and the restriction of matrices to Smn by a 
superscribed circumflex (hat). Thus for a {21 + l^u dimensional column vector w, the vector 
w is a A; dimensional column vector given by 

w = QLw. (17) 

Note that this operation consists of both projecting w into and changing to the internal 
coordinate system. Similarly, for a {21 + l^u by {21 + l^u matrix M, the matrix M is /c by 
k and given by 

M = QLMQ^„. (18) 

To go back to the original {21 + l^u dimensional local vector space, note that Q^„Qmn = I 
while QmnQmn represents projection on §^„, i.e., it has null space E>mn and acts as the identity 
on Smn- We may write w as 

w = w(ll)+w(^), (19) 

w(ll) = a£w = Q„„w, w(^) = A(^2w, (20) 

where w*^") and w*^-*--' denote the components of w in Smn and S^n, respectively, and the 
projection operators A^n and A^ are given by 

■^mn ~ QmnQmn' ^rnn ^ QmnQmn" (21) 
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In addition, if M is symmetric with null space S, 



(22) 



Note that Pt 



mn 



is diagonal, 




(23) 



and thus it is trivial to invert. 

3 Data assimilation 

With Section 2 as background, we now consider the assimilation of observational data to 
obtain a new specification of the probability distribution of the local vector. In what follows, 
the notational convention of Ide et al. (1997) is adopted whenever it is possible. 

Let x^^ be the random variable at the current analysis time t representing the local 
vector after knowledge of the observations and background mean are taken into account. For 
simplicity, we assume that all observations collected for the current analysis were taken at the 
same time t. Let yj^^ be the vector of current observations within the local region, and assume 
that the errors in these observations are unbiased, are uncorrelated with the background, and 
are normally distributed with covariance matrix Rmn- An ideal (i.e., noiseless) measurement 
is a function of the true atmospheric state. Considering measurements within the local region 
(m, n), we denote this function 7imn{-)- That is, if the true local state is xj^;^^, then the 
error in the observation is — Hmni^mn)- Assuming that the true state is near the mean 
background state x^^, we approximate TCmni^mn) by linearizing about x^„. 




.a 

■mm 



(24) 



where 



Ax; 



mn 



= X 



mn 



.a 




(25) 
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and the matrix is the Jacobian matrix of partial derivatives of Timn evaluated at x^„. 
(If there are s scalar observations in the local {21 + 1) by (2/ + 1) region at analysis time t, 
then y^^ is s dimensional and the rectangular matrix Hmn is s by {21 + Then, since 

we have assumed the background (pre-analysis) state to be normally distributed, it will 
follow below that x^^ is also normally distributed. Its distribution is determined by the most 
probable state x^„ and the associated covariance matrix P^„. The data assimilation step 
determines x^„ (the local analysis) and (the local analysis covariance matrix). 

Since our approximate background covariance matrix P^^ has null space E)mn, we con- 
sider the analysis increment component Ax^il'' = Al!i]i(x^„ — x5'„„) within the fc-dimensional 
subspace S^^, and do the data assimilation in S^^. Thus the data assimilation is done by 
minimizing the quadratic form, 

J(Ax^J = (Ax^J^(PL)-'Ax^„ 

+ (HmnAx^„ + 7imn{^mn) ~ ymn) ^mn ^ 

(H„„AX^„ + nUK^n) - ymn)- (26) 

Here = H^^Q^^ maps E>mn to the observation space, using the internal coordinate 

system for E>mn introduced in the previous section, so that Ax^n = Qr„„Ax^„. The most 
probable value of Ax^„, 

A^mn ~ ^mn^mn^mniymn ~ '^mni^mn))^ (27) 

is the minimizer of J( Ax^^) , where the analysis covariance matrix P^^ is the inverse of the 
matrix of second derivatives (Hessian) of J(Ax^^) with respect to Ax^^, 

Pmn = [(Pmn) ^ + ^Zin^mn^mn] ^ (28) 

For computational purposes, we prefer to use the alternate form, 

■^mn ~ ■f*mn[-'- '^mn^mn^mn^ mn\ ' (29) 
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both in place of (j28j) and in computing (j27|) . A potential numerical advantage of (|29|) over 
is that involves the inverse of P^n? which may be problematic if has a small 
eigenvalue. 

Another alternative is to compute (j27|) and (j28j) in terms of the "Kalman gain" matrix 
Then it can be shown (e.g., Kalnay 2002, p. 171) that (jZZD and (j2HD/(EnD are equivalent to 
and 

-Pmn ~ (I ~ Km7iHmn)Pm,ri- (32) 

Again, the inverse of Pj^„ is not required. 

Though (P7j) and are mathematically equivalent to (jHUjl - lfH^ . the former approach 
may be significantly more efficient computationally for the following reasons. In both cases, 
one must invert an s by s matrix, where s is the number of local observations. While these 
matrices are considerably smaller than those involved in global data assimilation schemes, they 
may still be quite large. Generally the s by s matrix Rmn whose inverse is required in ()29|) will 
be diagonal or close to diagonal, and thus less expensive to invert than the matrix inverted 
in (|3U|). (Furthermore, in some cases one may be able to treat Rmn as time-independent and 
avoid recomputing its inverse for each successive analysis.) The additional inverse required in 
()29|) is of a A; by matrix, where k < k' may be relatively small compared to s if the number 
of observations in the local region (m, n) is large. 

Finally, going back to the local space representation, we have 

^mn ~ Q,mn/^^mn "I" ^mn- (33) 
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4 Updating the ensemble 

We now wish to use the analysis information, and x^„, to obtain an ensemble of global 
analysis fields {x"*^*^(r, t)}; i = 1,2, ■ ■ ■ , k' + I. Once these fields are determined, they can be 
used as initial conditions for the atmospheric model. Integrating these global fields forward 
in time to the next analysis time t + At, we obtain the background ensemble {x^^*^(r, t + At)}. 
This completes the loop, and, if the procedure is stable, it can be repeated for as long as 
desired. Thus at each analysis time we are in possession of a global initial condition that can 
be used for making forecasts of the desired durations. 

Our remaining task is to specify the ensemble of global analysis fields {x"'^*)(r, t)} from 
our analysis information, P^^ and xj^„. Denote {k' + 1) local analysis vectors by 

a{i) ^ -a I ca{i) (oA\ 
mn van ' " van ' V" / 

Using (Uni) and we write 

_ ^ a(i)(||) I r a(i)(±) _ Q X-J-a(j) i X a{i)(l.) /ocN 
^■^mn ~ ^-^mn ' '^■^mn ~ ^ran^-^mn ' '^■^mn ■ W'-'/ 

In addition, we let 

because our analysis uses the observations only to reduce the variance in the space Smn, 
leaving the variance in Smn unchanged. (We note, however, that by our construction of E>mn 
in section 2, the total variance in S^n is expected to be small compared to that in Smn- Also, 
in the case k = k' all members of the analysis perturbation ensemble will lie in S^n, so that 
projection onto S^n is superfluous, and Sx.mn^'^^ in and the term Amn^Xmn in (jHTjl (below) 
may be omitted.) Combining (j^UI) and (jMjl - ljHBj) . we have 

^van ~ ^mn ^ ^mn^-^mn ' ^^mn'-'^van- W/ 

We require that 

fe'+i 

E = 0, (38) 
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which, by virtue of and (from © and 

E5x^t) = 0, (39) 
1=1 

is equivalent to 

fc'+i fc'+i 

E ^^-^"^ = E = 0- (40) 
fi=l j=l 

Thus we require that 

fc'+i 

E = 0- (41) 
1=1 

In addition, is given by 

fc'+i 

P'Ln = k'-'J2^^^^^^- (42) 
j=l 

Hence the local analysis state (determined in Section 3) is the mean over the local 
analysis ensemble {xmn}, and, by (jl^ . {5xmn} gives a representation of the local analysis 
error covariance matrix. We now turn to the task of determining the analysis perturbations 
{6x.mn}. Once these are known {xmn} is determined from (jHTj) . 

4.1 Determining the ensemble of local analysis perturbations 

There are many choices for {(5xmn} that satisfy (jH} and (02)), and in this section we will 
describe possible methods for computing a set of solutions to these equations. (See also 
Tippett et al. (2002) for different approaches to this problem in the global setting.) In a 
given forecasting scenario, one could compare the accuracy and speed of these methods in 
order to choose among them. There are two main criteria we have in mind in formulating 
these methods. 

First, the method for computing {6^mn} should be numerically stable and efficient. Sec- 
ond, since we wish to specify global fields that we think of as being similar to physical fields, 
we desire that these fields be slowly varying in m and n. That is, if P^n is slowly varying, we 
do not want to introduce any artificial rapid variations in the individual (5xmn through our 
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method of constructing a solution of (j41|) and (j42j) . For this purpose we regard the background 
vectors as physical states, and hence slowly varying in m and n. (This is reasonable since the 
background ensemble is obtained from evolution of the atmospheric model from time t — At 
to time t.) 

Thus we are motivated to express the analysis ensemble vectors 6x.mn as formally linearly 
related to the background ensemble vectors. We consider two possible methods for doing this. 
In the first method, we relate 6x.mn to the background vector with the same label i, 

where 

(Note that the apparent linear relation between the background and analysis perturbations in 
is only formal, since our solution for Z^n will depend on the background perturbations.) 
In the second method, we formally express (5xmn as a linear combination of the vectors, (5xmn , 



where 



^mn ~ (^ ) ^ {^^mn ^ l^^mn ^| " " " \6'kj^^ -*}. (46) 

Using pUj) the analysis and the background covariance matrices can be expressed as 

P-^'^ = X^An. (47) 

The k X k matrix Z^n or the {k' + 1) x [k' + 1) matrix Y^n can be thought of as a generalized 
'rescaling' of the original background fields. This 'rescaling' can be viewed as being similar to 
the techniques employed in the breeding method (Toth and Kalnay, 1993) and in the Ensemble 
Transform Kalman Filter approach (Bishop et al., 2001; Wang and Bishop, 2002). If Z^n or 
Ymn vary slowly with m and n, then by and so will Sx." 
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Considering we see that PT|) is automatically satisfied because, by and (jH)), the 
background perturbations Sx-ml sum to zero, 

X^nV = 0, (48) 

where v is a column vector of {k' + 1) ones. The analysis perturbations given by will 
satisfy (jl2|l [equivalently (jUj)] if, and only if, 

= Z P'' (49) 

^ mn ^mn^ mn^mn' l^"'/ 

Considering (|^. we see that (jTTj) yields the following equation for Y^^ 

pa _ -vb -y- -y-T -jrbT ( ^0) 

mn -^mn ^ mn ^ mn-^mn' ) 

Unlike PHj) for Z^n, does not imply automatic satisfaction of (jH)). We note that (PT|) 
can be written as 

X^„v = 0. (51) 
Thus, in addition to (j5Up . we demand that Y^^ must also satisfy 

XLY„„v = 0. (52) 

Equation (pUj) has infinitely many solutions for Z^n- Similarly, equations (jSO)) and 
have infinitely many solutions for Y^n- In order for the results to vary slowly from one grid 
point to the next, it is important that we use an algorithm for computing a particular solution 
that depends continuously on PJ^^ and P\ 



mn' 



4.2 Solutions of Equation 
4.2.1 Solution 1 

One solution Zm„ is 



Zrran ^ mri) ^ mri) i (^3) 
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where in (j53|) . by the notation M^/^, we mean the unique positive symmetric square root of 
the positive symmetric matrix M. In terms of the eigenvectors and eigenvalues of M, the 
positive symmetric square root is 

k 

M^/^ _ J2 v^m(^')(m(^'))^, (54) 
i=i 

where 

Mm(^') = z/(^')m(^'). (55) 
Recall that is diagonal, so that its inverse square root in is easily computed. 

4.2.2 Solution 2 

Pre- and post-multiplying (jl^ by (Pmn)^^^ ^^'^ taking Zmn to be symmetric, 

f-pb \l/2r7 f-pb U/2i2 _ /-pfc \l/2-pa ffjb \l/2 /p-^N 
\^ mn) ^mny^ mnJ J ~ \^ mn) ^ mn\^ mnJ ■ W^y' 

Taking the positive symmetric square root of (jSHI), we obtain a second possible solution of 



-1/2 



f-pb U/2pa /-pb \l/2 
\ mn) mn\ mn) 



(57) 



In contrast to solution 1 (given by (jSHI)) and solutions 3 (given below), this solution yields a 
Zmn that is symmetric, Z^n = Z^„. 

4.2.3 Family of solutions 

We can create a family of solutions for Z^n by introducing an arbitrary positive symmetric 

^^^^ -j^ /2 

matrix Dm„ and by pre- and post-multiplying (pUj) by Dmn • This yields 



where 

Zmn = '^mn '^mrJ^mm (59) 

pa,6 ^ j3-l/2pa,6j3-l/2 /ggx 

mn mn mn van V^"/ 



17 



Applying solution 2 to (fHHj) we obtain (j^Tj) with Z^n and P^^ replaced by Z^n and P^^^ Then, 
applying and (plj) . we find that the unique solution to such that Dmn^^ZmnDmn is 
symmetric is 

z =D^/2rp^ r^/^[fp'' r^/^p" fp^ r^^^l^^^p^ r^/^D-^/^ /gl^ 

^mn ^mn\-^ mnJ \^ mn) ^ mn\^ mn) V-^ mn) ^mn ■ \^-^) 

Thus for any choice of D^n we obtain a solution Z^n of (HHI), and this is the unique solution 

]^ /2 1/2 

for Zjnn subject to the added condition that Dmn Zmn^^mn is symmetric. 
Another way to generate a family of solutions is to replace (jK!^ by 



Zmn — V PmnV (Pmn) ) (62) 



where for a positive definite symmetric matrix M, we mean by vM any matrix for which 
-\/M-\/M = M. Note that this equation does not uniquely determine a/M, and that given 
any solution v^M = W, the most general solution is v^M = WO where O is any orthogonal 
matrix. In particular, the positive symmetric square root (which we denote M^/^) is a specific 
choice for V^M, and, in general, V^M = M^^^O. Furthermore, by considering all possible 



matrices yP^n obtain all possible solutions Z^n of Thus we can write a general 

solution of pnj) as 

Zmn = (Pmn) ^mn{Pmn) ) (63) 

where Omn is an arbitrary orthogonal matrix. (Note that Omn can be a function of P^n 
Pmn-) ^'^^ further discussion see Appendix A. 

The family of solutions of (j49|) generated by with different D^n is smaller than the 
family given by with different Omn- In particular, the family ()fi3|l . being the most 
general solution of ()49|). must contain the family corresponding to ()6H). To see that the latter 
family is indeed smaller than the former family, consider the special case, P^n — ^mn- 
Pmn = Pmn' © always givcs Zmn = I, whilc givcs 

Z = fP" ]-^/^0(°) fP" ^1/2 /g^N 
^mn \^ mn) ^mn\^ mn) i \^^) 

which is never I unless the orthogonal matrix Omn is I. (Here Omn denotes O^n evaluated 
at Pmn = Pmn-) Bascd ou our treatment in section 4.3, we believe that the smaller family. 



given by (|HT|) with different Dmn, gives results for that are more hkely to be useful for 
our purposes. 



4.2.4 Solution 3 

Subsequently, special interest will attach to the choices Dmn = Pmn D^n = Pmn (EU- 
Although these two choices yield results from (jUUjl and that appear to be of quite different 
form, the two results for Z^n are in fact the same. We call this solution for Z^n solution 3. 
To see that these two Dmn choices yield the same Z^n, we note that can be put in the 
form, 



pa U/2 /pa \-l/2y /p 
^ mn) \^ mn) ^mn\^ 



a \l/2 
mn) 



,a \l/2 _ /-pfe \l/2 



pb Ni/^ (-ph \-l/2y /pb N 
^ mn) \^ mn) ^mn\^ mn) 



b ^1/2 
mn 



yb a/2 



(65) 



Thus symmetry of (Pmn) ^''^2mn(Pmn)^^^ (required by the choice D^n = Pmn (ED)) implies 



symmetry of (P^„)~^^^Zmn(Pm„)^^^ (i.e., D^n = Pm„ in (jUT|) ) and vice versa. Hence, the 
two choices for Dmn necessarily yield the same Zmn- Explicitly, setting Dmn = Pmn (EH) 
we can write solution 3 as 



b a/2 



(P ) 

V mnJ 



yb N-l/2pa fryb N-1/2 
mn) mn\ mn) 



1/2 



yb \-l/2 



(66) 



As discussed subsequently, alternate formulations exist for which solution 3 does not require 
inverting Pmn (see Equations (77), (79), and (80)). This may be advantageous if Pmn has 
small eigenvalues. 



4.3 'Optimal' choices for Zmn 

Since we think of the background ensemble members as physical fields, it is reasonable to 
seek to choose the analysis ensemble perturbations 6x.mn in such a way as to minimize their 
difference with the background, 

fc'+i fc'+i 

^i^^mn) ~ ^ ] ll'^^rin ~ ^-^mn II ~ ^ ] i^'^mn ~ ^'^mn] l^^mn ~ ^^mn]y (^7) 
i=l 1=1 
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subject to the requirement that (j42j) be satisfied. Thus, introducing a k x k matrix of 
Lagrange multiphers, we form the following quantity, 

fc'+l k ^ fc'+l 

/: = ff^x"(') - f^x^(*)l'^ff^x''(*) - (^x^(*)l - fB 1 lip"- ) - — f^x"(*)Wf^x" 



j^/ / J \""mn/p\ mnJq 

i=l Pi<?=l *=1 



(68) 

which we minimize with respect to 6x.mn and Bmn- Forming the first and second derivatives 
of C with respect to 5xmn, we have 

1 dC 



2 

(JUJi-mn 

1 92/: 



7-1 x-^aCO _ x-^M*) (aq) 



Z^L (70) 



where we have defined Z_i as 



Zmn = I + ^(Bmn + B^„). (71) 

Since £ is stationary, (j69j) implies (j43p . and the derivative with respect to B^^ returns (j42p . 
Since C is minimum, (fTOj) implies that Z^n is positive, while (f7T|) gives Z^n = Thus 
the solution that minimizes jF(xmri) is obtained from the unique symmetric positive solution 
for Zmn- This is given by solution 2 (fHTj) . 

It is also of interest to consider different metrics for the distance between the analysis 
ensemble {(5xmn} and the background ensemble {5xmn}- Thus we minimize the quadratic 
form, 

fc'+i fc'+i 

i=l i=l 

where the positive symmetric matrix T)mn specifies the metric. (The quadratic form jF(5xmn) 
is the special case of jFo((5xmn) when the metric is defined by the identity matrix, D^n = I)- 
The introduction of the metric matrix D^n is equivalent to making the change of variables, 
= (Dmn)^^^^X^^. Inserting this change of variables in (jFFj) . we obtain (jHH). 
Solution 3, namely D^n equal to P^n Pmn^ appears to be favorable in that it provides 
a natural intuitive normalizations for the distance. We thus conjecture that solutions 3 may 
yield better performance than solutions 1 and 2. 
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4.4 Solution of dSHl) and (15^ 

Another way of solving for the analysis fields is to use the 'Potter method' (e.g., Biermann 
1977). To see how this solution is obtained, let 

SO that (j^mi becomes 

Because P^n is x and Amn is [k' + 1) x [k' + 1), there is a lot of freedom in choosing 
Amn- It seems reasonable that, if the analysis covariance and the background covariance are 
the same (i.e., PJ^^ = P^n)' then the ensemble analysis perturbations should be set equal to 
the ensemble background perturbations: 

Ymn = 1 if Pmn ~ ^mrf C^^) 

A solution for Amn consistent with (f7S j) - (f7Sj) is 

A = T -I- X^-^ fP^ ^^"^fP" — P** UP^ 'l^'^X'' (7fi) 

^^mn ^ ' -^mny mn) L mn mn\\ mn) van' V ' ^/ 

This solution for A^n is symmetric and can also be shown to be positive definite. Equation 
(fTUj) yields Amn = I if P^n = Pmn^ required by (fTSj) and (fTSj) . and satisfaction of (fTIj) by 
(fTUj) can be verified by direct substitution and making use of P.|^„ = Xj^^Xj^. From (fTSj) 



we have Ymn = VAmn, and, if the positive symmetric square root is chosen, then (f7H|l is 
satisfied. Thus we have possible solution 

V _ U/2 f-7-7\ 

^ mn y-'^mnj ■ \' ' J 

It remains to show that (fTBj) and (f77|) also satisfies (jH21)- By (fTBj) and (pSj) we have A^nV = v; 
i.e., V is an eigenvector of Amn with eigenvalue one. Since the positive square root is employed 
in (f77|) V is also an eigenvector of Ymn with eigenvalue one. Hence X^^Y^^v = X|^^v, which 
is identically zero by PH|l. thus satisfying (j^ . 
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Potter's expression for A^n is obtained by using (jHUI) and in (jTB)), 

For (j77j) and (j78|) the square root is taken of a A;' + 1 by A;' + 1 matrix, but the inverse is of 
an s by s matrix, where s is the dimension of the local observation space. An equivalent way 
to write (f75|) in our setting is 

where 

"^mn [I "I" '^mn^mn^rnnP mn] ■ (^0) 

Now aside from R^n; we need only invert a. khj k matrix. As previously discussed, although 
Hmn is s by s, its inverse is easily computed even when s is much larger than k. 

We now ask whether each solution Zmn of PU)) has a corresponding Y^n such that ZmnX|^„ 
and X^„Ymn yield the same result for X^„. To see that they do, we note that the matrix 
X,|^„ (which consists of k rows and k' + 1 columns) has a (nonunique) right inverse (Xj^^)^^ 
such that X[^„(X|^„)""^ = 1^, where 



O^mn) ~ ■^mnO^mn^mn) + ^mn — '^mni^ mn) ~^ ^mn, (81) 



and Emn is any k x {k' + 1) matrix for which X^^E^^ = Omn- Thus, from X;^„ = Z^nX^^, 
we have 

~ -X^„(X^^) Z^^X^^. (82) 
From the definition of Ymn, = X^^^Y^n, we see that (jH^ and (jHT|) yield 

Ymn '^mni^mn) '^mn^rnn ~^ (^3) 

where G^n is any (fc' + 1) by {k' + 1) matrix satisfying X^„Gmn = 0. Since we desire that 
Ymn = Ifc'+i, when Z^n = Ifc, a possible choice for G^n is 

Gmn = Ifc'+l ~ X^„(P^„) X^^. (84) 
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(We note that Gmn given by (jS^ is a projection operator, {GmnY = Gmn for any integer 
exponent p.) Thus from (jH^ and (jHH), a Ymn corresponding to any solution Z^n (e.g., 
solution 1, 2 or 3) is 

Ymn = X^„(P^„) ^(Zmn — Ia:)X^„ + (85) 

Using (jH3), and (jIHj) it can be verified that Ym„Y^„ = A^n with A^n given by (fTUI) . 
Thus Ym-nYmn scmg {k' + 1) X (k' + 1) matrix for all solutions Z^n (e.g., solutions 1,2, 

and 3). The general solution of Ym,nY^„ = A^n is 



\/ -^mn (A^n) ^ Omnj (^6) 

where O^n is an arbitrary orthogonal matrix. However, to ensure that is satisfied we 
also require that O^nV = ±v (where v is a column vector of {k' + 1) ones); i.e., that v is 
an eigenvector of Omn with eigenvalue ±1.. For example, Omn can be any rotation about 
V. Thus there is still a large family of allowed orthogonal matrices Omn- (Note that Omn 
can depend on P^^ and P^n' ^'^d that for (fTSj) to be satisfied, Omn must be I whenever 
Pmn = PmnO Heucc wc cau think of the various solutions for Ymn either as being generated 
by (jF)T|) and with different choices for the metric matrix Dmn, or as being generated by 
()76|) and ()86|) with different choices for the orthogonal matrix Omn- 

Note that since (P5^„)^^Zmn is symmetric for solution 3 (e.g., see the resulting Y^n 

from (jHH|l is symmetric and must therefore coincide with ()77|1 . That is, Zmn^tm with Z^n 
given by and X^nYmn with Ymn given by (fTHj) and (f77j) both yield the same result for 
X^„. Also, in Appendix B we show that Y^n as given by (jH3j) can be used to directly obtain 
the analysis (5xmn (note the absence of the superscribed circumflex on 5x^*-*^'' 



^mn 



4.5 Construction of the global fields 

Regardless of which of these solution methods for {Sx.mn} is chosen, by use of (j37| we now 
have {k' + 1) local analyses Xmn at each point r^n, and it now remains to construct an 
ensemble of global flelds {x"*^*)(r, t)} that can be propagated forward in time to the next 
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analysis time. There are various ways of doing this. The simplest method is to take the state 
of the global vector, x"*-*\ at the point Tmn directly from the local vector, Xmn, for the local 
regions centered at r^n. This approach uses only the analysis results at the center point of 
each local region to form the global analysis vectors. Another method (used in our numerical 
example of section 5) takes into account atmospheric states at the point r^^ obtained from 
all the local vectors x^^,^, (|m'| < /, \n'\ < I) that include the point Tmn- In particular, 
these states at r^n are averaged to obtain x°'^^\r,t). In forming the average we weight the 
different local regions with weights depending on (m', n') such that the weights decrease away 
from Tmn- The motivation for this is that, if we obtain from Xmn state estimates at points 
in the local region mn, then the estimates may be expected to be less accurate for points 
toward the edges of the local region. Note that such averaging to obtain x°'^^\rmn,'t) has the 
effect of gradually decreasing the influence of observations that are further from the point 
Vmn at which x''*^*)(rm„, t) is being estimated. In order to illustrate this, consider the case 
where the weights are equal for \m'\ < /', \n'\ < I', where /' < /, and zero for / > |m'| > 
I > \n'\ > I' (this is what we do in Section 5). That is, we only average over states obtained 
from local regions whose centers are within an inner {21' + 1) x [21' + 1) square contained 
in the (2/ + 1) x (21 + 1) local region {m,n), and we give those inner states equal weights. 
See Figure 121 for an illustration of the case I = 5, I' = 2. For the example in Figure 121 an 
observation at a point within the 7x7 inner square would be contained within all the 25 local 
regions centered at points within the inner 5x5 square, but observations outside the inner 
7x7 square are contained in fewer of the local regions, and observations outside the 21 x 21 
square centered at point mn are in none of the 25 local regions. Note also that in the case 
/' = 0, we use only the analysis at the center point of the local region (no averaging). We do 
not believe that there are any universal best values for the weights used to form the average; 
in a particular scheme, the parameter /' can be varied to test different weightings, or more 
general weights can be considered. 
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4.6 Variance inflation 

In past work on ensemble Kalman filters (Anderson and Anderson 1999; Whitaker and Hamill 
2002) it was found that inflating the covariance (P" or P^) by a constant factor on each 
analysis step, leads to more stable and improved analyses. One rationale for doing this 
is to compensate for the effect of finite sample size, which can be shown to, on average, 
underestimate the covariance. In addition, in Section 5 and Appendix C we will investigate the 
usefulness of enhancing the probability of error in directions that formally show only very small 
error probability (i.e., eigendirections corresponding to small eigenvalues of the covariance 
matrices). Following such a modification of P^„ or P^„, for consistency, we also make 
modifications to the ensemble perturbations 6x.mn or 5xmn so as to preserve the relationship 
(jT7|) . (Again, similar to the discussion in Section 4.2.3, the choice of these modifications is 
not unique.) 

In our numerical experiments in section 5 we will consider two methods of variance infla- 
tion. One method, which we refer to as regular variance inflation, multiplies all background 
perturbations 5xmn by a constant (1 + 6). This corresponds to multiplying P^„ by (1 + 5)^. 
This method has been previously used by Anderson and Anderson (1999) and by Whitaker 
and Hamill (2002). In addition to this method, in Appendix C we introduce a second variance 
inflation method, which, as our results of section 5 indicate, may yield superior performance. 
We refer to this method as enhanced variance inflation. 

5 Numerical experiments 
5.1 Lorenz-96 model 

In this section we will test the skill of the proposed local ensemble Kalman Filter scheme 
by carrying out Observing System Simulation Experiments (OSSE's) on the Lorenz-96 (L96) 
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model (Lorenz 1996; Lorenz and Emanuel 1998), 

— 7— = i^m+l ~ Xm-2)Xm-l ~ + F- (87) 

at 

Here, m = 1,---,M, where = xm-i, Xq = xm, and xm+i = a^i- This model mimics 
the time evolution of an unspecified scalar meteorological quantity, x, at M equidistant grid 
points along a latitude circle. We solve (jSTjl with a fourth-order Runge-Kutta time integration 
scheme with a time step of 0.05 non-dimensional unit (which may be thought of as nominally 
equivalent to 6-h in real world time assuming that the characteristic time scale of dissipation 
in the atmosphere is 5-days; see Lorenz 1996 for details). We emphasize that this toy model, 
(jHZj), is very different from a full atmospheric model, and that it can, at best, only indicate 
possible trends and illustrate possible behaviors. 

For our chosen forcing, -F = 8, the steady state solution, x^ = F for m = 1, ■ ■ ■ , M, in (jHTj) 
is linearly unstable. This instability is associated with unstable dispersive waves characterized 
by westward (i.e., in the direction of decreasing m) phase velocities and eastward group 
velocities. Lorenz and Emanuel (1998) demonstrated by numerical experiments for F = 8 
and M = 40 that the x field is dominated by a wave number 8 structure, and that the system 
is chaotic; it has 13 positive Lyapunov exponents, and its Lyapunov dimension (Kaplan and 
Yorke 1979) is 27.1. It can be expected that, due to the eastward group velocities, growing 
uncertainties in the knowledge of the model state propagate eastward. A similar process can 
be observed in operational numerical weather forecasts, where dispersive short (longitudinal 
wave number 6-9) Rossby waves, generated by baroclinic instabilities, play a key role in the 
eastward propagation of uncertainties (e.g., Persson 2000; Szunyogh et al. 2002; and Zimin 
et al. 2003). 

We carried out experiments with three different size systems (M = i x 40, i = 1,2, 3) and 
found that increasing the number of variables did not change the wavelength, i.e. the x fields 
were dominated by wavenumber i x 8 structures. 
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5.2 Rms analysis error 

The 40-variable version of the L96 model was also used by Whitaker and Hamill (2002) to 
validate their ensemble square root filter (EnSRF) approach. In designing our OSSE's we 
follow their approach of first generating the 'true state', m = 1,---,M, by a long 

(40,000 time-step) model integration; then first creating 'observations' of all model variables 
at each time step by adding uncorrelated normally distributed random noise with unit variance 
to the 'true state' (i.e., = I)- (The rms random observational noise variance of 1.00 is 
to be compared with the value 3.61 of the time mean rms deviation of solutions, Xm{t), of 
()87|) from their mean.) We found that our results were the same for Gaussian noise and for 
truncated Gaussian noise (we truncated at three standard deviations). The effect of reduced 
observational networks is studied by removing observations one by one, starting from the full 
network, at randomly selected locations. The reduced observational networks are fixed for all 
experiments. That is, the difference between a network with O observations and another with 
+ 1 observations is that there is a fixed location at which only the latter takes observations. 

The observations are assimilated at each time step, and the accuracy of the analysis is 
measured by the time mean of the rms error. 



5.3 Reference data assimilation schemes 

In order to the assess the skill of our data assimilation scheme in shadowing the true state, 
we considered three alternative schemes for comparison. 

5.3.1 Full Kalman filter 

For the sake of comparison with our local ensemble Kalman filter results, we first estab- 
lish a standard that can be regarded as the best achievable ensemble Kalman filter result 
that could be obtained given that computer resources placed no constraint on computa- 



E 




1/2 




m=l 
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tions of the analysis. (In contrast with operational weather prediction, for our simple M- 
variable Lorenz model, this is indeed the case.) For this purpose, we considered the state 
x(t) = {xi{t),X2{t),- ■ ■ ,XM{t)) on the entire domain rather than on a local patch. Then 
several Kalman filter runs were carried out with different numbers of ensemble members. In 
these integrations, full {k') rank estimates of the covariance matrices were considered and 
the ensemble perturbations were updated using (fTUj) . (f77j) . and of Appendix B. (see 
Section RT^ . 

We found that stable cycling of the full ensemble Kalman filter requires increasing vari- 
ance inflation when the number of observations is reduced, even if several hundred ensemble 
members are used (e.g., the assimilation of 21 observations required 2% variance inflation). 
This suggests that variance inflation is needed, not to compensate for sampling errors, but to 
correct for variance lost to nonlinear effects. 

It can be seen that by increasing the number of ensemble members the time mean of E 
converges to 0.20 regardless of M (figure^. The only difference between the different size 
systems (characterized by different values of M) is that more ensemble members are required 
to reach the minimum value for the larger systems. We refer to 0.2 as the "optimal" error, 
and we regard it as a comparison standard for our local Kalman filter method. (However, we 
note that it is not truly optimal since Kalman filters are rigorously optimal only for linear 
dynamics.) 

5.3.2 Conventional method 

We designed another comparison scheme that we call the conventional method, to obtain an 
estimate of the analysis error that can be expected from a procedure analogous to a 3D- 
Var scheme adapted to the L96 model. In this scheme, only the best estimate of the true 
state is sought (not an ensemble of analyses) using a constant estimate of the background 
error covariance matrix that does not change with time or position. This background error 
covariance matrix was determined by an iterative process. In the first step, the background 
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error covariance matrices from the full Kalman filter were averaged over all locations and time 
steps to obtain a first estimate. Then, a time series of the true background error vector bm = 

— was generated and used to obtain an estimate of the background error covariance 
matrix for the next iteration step. This step was repeated until the estimated background 
error covariance matrix converged, where the convergence was measured by the Probenius 
matrix norm. We found that this procedure was always convergent when all variables were 
observed. The estimate obtained this way is not necessarily optimal in the sense of providing 
the smallest possible analysis error of any constant background error matrix, but it has the 
desirable feature that the background error statistics are correctly estimated by the analysis 
scheme. This is a big advantage compared to the operational schemes, for which the estimate 
of the background error covariance matrix has to be computed by rather ad hoc techniques, 
since the true state, and therefore the true background error statistics, are not known. Thus, 
it might be assumed that our "conventional method" provides an estimate of the analysis 
error that is of good accuracy as compared to analogous operational schemes. 

For reduced observational networks {O < M) , the background error covariance matrix was 
determined by starting the iteration from the background error covariance matrix for O + l. It 
was found that, when more than a few observations (more than 6 for M = 40) were removed, 
our iterative determination of background error covariance matrices started to diverge after 
an initial phase of convergence. This probably occurs because the background error becomes 
inhomogeneous, due to the inhomogeneous observing network, and the average background 
error underestimates the error at the locations where the background error is larger than 
average. This leads to a further increase of the background error at some locations, resulting 
in an overall underestimation of the background error. This highlights an important limitation 
of the schemes based on a static estimate of the background error covariance matrix: The 
data assimilation scheme must overestimate the average background error in order to prevent 
the large local background errors from further growth. Keeping this in mind, we chose that 
member of our iteration scheme that provided the smallest analysis error. 
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5.3.3 Direct insertion 

We now give a third standard designed to decide wlietlier the data assimilation schemes pro- 
vide any useful information compared to an inexpensive and simple scheme, not requiring 
matrix operations. This scheme, called direct insertion, updates the state estimate by replac- 
ing the background with the observations, where observations are available, and leaving the 
background unchanged, where there are no observations. 

5.4 Implementation of the Local ensemble Kalman filter 



We now describe the implementation of our method on the L96 model. From ()28|) we know 
that for our OSSE's (Rm is the 0x0 unit matrix), the analysis error covariance matrix is 
= (Ptr^ + QmH^HQ„]"\ where H^H = diag[cri, era, . . . , cr^, . . . , ctm] and (T„=1 if 
there is an observation at grid point m and is zero otherwise. (Here the local regions are 
labelled by a single subscript m (rather than mn as used in Section 2.4) corresponding to 
the one dimensional spatial variable m in (j87|) .) [When observations are at all M grid points 
H^H = I, and since Q^Qm commutes with P|^, we have that Pj^ and commute. Thus, 
when every point is observed, (|^. (jSZj), and are identical, and solution 1, 2, and 3 are 
the same.] We implement this solution using (fTUj). (f77j) and (PHj) of Appendix B. 

In our experiments, the local analysis covariance matrix is computed by (|29|) and the local 
analysis is obtained by (jTfj) . The analysis ensemble is updated by (fTHj) . (fTTj) and of 
Appendix B, and the variance of the background ensemble is increased by a factor of 1 + £ 
in each step using the enhanced variance inflation algorithm (see Appendix C for detail). 
The final analysis at each point m is computed by averaging {21' + 1) = 5 local analyses 
(see figure 01). We found, by numerical experimentation, that achieving the same accuracy 
requires fewer ensemble members using averaging instead of simply inserting the center point. 
Choosing I = 6, k = k' , and e = 0.12, values that for M = 40 gave the lowest mean error 
(0.2), we found that the mean error does not change with increasing M. 

Figure El shows, that when the ensemble has at least eight members, the analysis error 
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settles at the level (0.2) of the "optimal" scheme, independently of the number of variables. 
This is roughly consistent with the supposition of an effective correlation length for the dy- 
namics that is less than M. Thus our method appears to be effective on large systems of 
this type. Moreover, the (non-parallelized) analysis computational time scales linearly with 
the number of local regions (i.e., with M). This favorable scaling is to be expected, since the 
analysis computation size in each local region is independent of M. 

We note, that the aforementioned scaling property of the local Kalman filter is in contrast 
to the behavior of the full Kalman filter, which requires many more members, and also an 
increasing number of members for an increasing number of variables, to achieve the "optimal" 
precision. This demonstrates the potential superiority of the local Kalman filter in terms of 
computational efficiency when applied to large systems. On the other hand, it also means that, 
since the minimum error was independent of M, it suffices to use the smallest, 40- variable, 
system for further experimentation. 

5.5 Comparison of the data assimilation schemes 

The four data assimilation schemes (local ensemble Kalman Filter, full Kalman filter, con- 
ventional method, and direct insertion) were compared for different numbers of observations 
(figure ini). The two Kalman filter schemes give almost identical error results, although the full 
Kalman filter has a very small advantage. The two Kalman filter schemes and the conven- 
tional data assimilation scheme are always more accurate than direct insertion, indicating that 
they are always able to retrieve nontrivial, useful information about the true state. The two 
Kalman filter schemes, in addition, have a growing advantage over the conventional scheme as 
the number of observations is decreased. This shows that, as the observational network and 
the background error become more inhomogeneous, the adaptive nature of the background 
error covariance matrix in the Kalman filters leads to a growing advantage over the static 
scheme. 

The above numerical experimentation results provide a guide for making good parameter 
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choices in the case of the L96 model. In future apphcations to actual weather models, choices 
for the analysis parameters might similarly be determined by experimentation, but it would 
also be useful to obtain some guides for initial guesses of good parameter choices. 

5.6 Sensitivity to the free parameters 

The free parameters of our scheme are the dimensionahty of the local regions (which is 21 + 1), 
the rank of the covariance matrices (k), and the coefficient (e) in the enhanced variance 
inflation algorithm. These parameters have been fixed so far. In what follows, the sensitivity 
of the data assimilation scheme to the tunable free parameters is investigated by numerical 
experiments {k' is held fixed at A;' = 9). In these experiments, our 'true state' and observations 
are generated in the same way as in Whitaker and Hamill (2002) (O = M). Also, we use the 
same ensemble size as Whitaker and Hamill [k' + 1 = 10). Hence our analysis error results 
and theirs can be directly compared. 

In the first experiment the variance infiation coefficient is constant, e — 0.012, while the 
dimension of the local vectors (2/ + 1) and the rank (k) of the background covariance matrix 
are varied. The results are shown in Table 1. The scheme seems to be stable and accurate for a 
wide range of parameters. The optimal size local region consists of 2Z + 1 = 11, 13 grid points, 
at which rank k = 5, 6, 7, 8, 9 estimates of the background covariance matrix provide similarly 
accurate analyses. Moreover, rank 3 and 4 estimates lead to surprisingly accurate analyses for 
the smaller size {21 + 1 — 5, 7, 9) local regions. This indicates that the background uncertainty 
in a local region at a given time (P^) can be well approximated in a low (k) dimensional 
linear space. Our premise, that the dimension of this space can be significantly lower than the 
number of ensemble members {k' + 1) needed to evolve the uncertainty, proved to be correct 
for the L96 model. (We note that the local dimensionality k is also much smaller than the 
"global" Lyapunov-dimension, 27.1, of the system). On the practical side, this result suggests 
that, at least for the L96 model, the efficiency of the analysis scheme can be significantly 
improved by using ranks that are smaller than the dimension of the local vectors and the 
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number of ensemble members. We note that our best results are at least as good as the best 
results published in Whitaker and Hamill (2002) and attain the optimal value (0.20) from 
section 5.3. 

In the second experiment, the dimension of the local regions is constant (2/ + 1 = 13), 
while the rank and the variance inflation coefficient are varying. The results are shown 
in Table 2. While lower rank estimates of the background error covariance matrix require 
somewhat stronger variance inflation, the results are not sensitive to the choice of e once it 
is larger than a critical value. (By critical value we mean the smallest e that provides the 
optimal error). 

The second experiment was then repeated by using the regular variance inflation of Ander- 
son and Anderson (1999) and Whitaker and Hamill (2002). In the regular variance inflation, 
all background ensemble perturbations are multiplied by r = 1+5, where 5 is small, 1 ^ 5 > 0. 
This inflation strategy increases the total variance in the background ensemble by a factor 
of (1 + A) = 1 + 5^ + 2(5 ~ 1 + 25. It can be seen from Table 3 that, except for = 4, the 
critical value of e is less than half of the critical value of A. The main difference between 
the two inflation schemes is that the enhanced scheme inflates the dominant eigendirections 
of the background covariance matrix less aggressively, and the least dominant eigendirections 
more aggressively. The numerical results suggest that this feature of the scheme is beneflcial, 
indicating that the ensemble-based estimate of the background error is more reliable in the 
more unstable directions than in the other directions. This is also well illustrated by the 
quantitative results shown in Figure [3 To explain this flgure, we deflne the true background 
error, = — by the difference between the truth, and the background mean, 
x^. We also deflne hm = b^Um\ the component of along the semi-axis of the probability 
ellipsoid, corresponding to the jth largest eigenvalue, \m of Pj^, where j = 1, 2, ■ ■ ■ , /c. (The 
case /c = 2 is illustrated in Figure |H1) For an ensemble that correctly estimates the uncertainty 
in each basis direction, the time means of 

d^ = i}^l>^f'\ J = l,2,---,fc, (89) 
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should be close to one. When, for a given j, the ratio dm is smaller than one, the ensemble 
tends to overestimate the distance between the truth and the background in the direction. 
When dli^ is larger than one, the ensemble underestimates this distance. Figure [3 shows that 
with the enhanced variance inflation the behavior of the ensemble is much better than with 
the regular variance inflation. This is especially true for the less dominant eigendirections, 
for which the ensemble with regular variance inflation signiflcantly (by about a factor of 
6) underestimates the distance between the truth and the mean background. We found 
that ||bm|P — Yl^j=i^m ' ^^^6 background variance unexplained by the directions, Um , 
j = 1,2, ■■■,9; is about 3% of the true total variance (||bm|p) for all four cases shown in 
Figure [7| Thus the results indicate that the superior performance of the enhanced variance 
inflation is due to the better distribution of the variance between the resolved directions. 
We note that this advantage of the enhanced variance inflation could not be exploited if the 
analysis was not done in introduced in Section 2. Whether the advantage found for 
our enhanced variance inflation scheme carries over from the L96 model to a more realistic 
situation remains to be determined. 

An interesting feature is the anomalously large error value of 0.29 at A = 0.036, A; = 8 in 
Table 3. An inspection of the data revealed that the higher time average is associated with 
a sudden and short-lived high amplitude spike in the rms analysis error. A further analysis 
of the problem revealed that spikes occur very rarely and they usually have small amplitude 
(smaller than 1). On rare occasions, however, the spikes can have large amplitude (sometimes 
larger than 5), and they can last a few thousand time steps. This phenomenon is illustrated by 
FigureEl in which the flrst large spike occurs after more than 162,000 time steps (equivalent to 
about 104 years, assuming that one time step is equivalent to 6 hours) and lasts about 12,000 
time steps (12 years in real time), and a second large spike develops 230,000 time steps (146 
years in real time) later, which lasts for 3000 steps (2 years). The severity of this problem 
was studied by carrying out several long term integrations with different combinations of 
the tunable parameters. An interesting feature is that the spikes do not destroy the overall 
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stability of the cycle; the large errors always disappear after a finite time and the mean error 
is smaller than 0.3. (For the case shown in Figure El the time mean error is 0.23). Spikes 
occur regardless of the size of the local region, and the type of the variance inflation scheme. 
They become less frequent, however, as the rank and the variance inflation are increased. 
In particular, no spikes were observed for e > 0.022. This suggests that the easiest way to 
prevent the occurrence of spikes is to choose a large enough variance inflation coefficient. 

All results shown so far were obtained using ()93|1 to generate the analysis ensemble, X^. 
This scheme results in analysis perturbations of the form ^Xm*"* = (5xm*'"'"'* + 5xm*'"'^'* as 
required by In order to test the importance of including the small 5xm*'"'^'* = 

Xm component, the first experiment was repeated by using solution 1 [ (j^ ] for and 
6x.m^^^'^^ = instead of (jHE))- (Using solution 1 and (jHEI) would give the same result as (jHSI) 
for our choice of Rmn = I-) This modified scheme, restricting the analysis perturbations to 
the k dimensional space S^, is clearly inferior (compare Tables 2 and 4 and Tables 3 and 5). 
More precisely, the constrained scheme provides stable analysis cycles only if both k and e 
are relatively large. This is not unexpected, since setting the component 6x.m^^^^^ to zero 
artificially reduces the total variance, ||5xm*''||^. Increasing k decreases the reduction in the 
total variance, while increasing e compensates for an increasing part of the lost variance. Also, 
the constrained scheme is more stable when the enhanced variance inflation is used, indicating 
that correcting the distribution of the variance is not less important than increasing the total 
variance. 

5.7 Discussion of the significance of the numerical experiments 

Finally, we reemphasize that the significance of the results of all our numerical experiments 
on the toy model (jSTj) is limited. Many important factors of real weather forecasting are 
not represented (e.g., model error), and very idealized conditions are assumed (e.g., known, 
normal, uncorrelated, unbiased, observation errors, and no "subgrid scale" stochastic-like 
input to the evolution of the "truth" state). On the other hand, it is also probably reasonable 
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to assume that, if our assimilation procedure gave unfavorable results for our idealized toy 
model situation, then the scheme would also be unlikely to be effective in the real case. 
Thus, one can view the good results obtained with our assimilation scheme in these numerical 
experiments as necessary, but certainly not sufficient, for future successful performance in a 
real situation. 

6 Summary and conclusions 

In this paper, we have introduced a local method for assimilating atmospheric data to deter- 
mine best-guess current atmospheric states. The main steps in our method are the following. 

• The global analysis ensemble members are advanced by the atmospheric model to obtain 
the global background ensemble at the next analysis time. 

• In each local region, each background ensemble member's perturbation from the ensem- 
ble mean is used to construct a 'local vector'. 

• Each of the local vectors in the ensemble is projected onto the local low dimensional 
subspace. 

• The observations are assimilated in each local region. 

• The local analyses are used to determine the global analysis and an ensemble of global 
analysis states. 

• The cycle is then repeated. 

Numerical tests of the method using the Lorenz model, (jHTjl . have been performed. These 
tests indicate that the method is potentially very effective in assimilating data. Other po- 
tential favorable features of our method are that only low dimensional matrix operations are 
required, and that the analyses in each of the local regions are independent, suggesting the 



36 



use of efficient parallel computation. These features should make possible fast data assimi- 
lation in operational settings. This is supported by preliminary work (not reported in this 
paper) in which we have implemented our method on the T62, 28-level version of the National 
Centers for Environmental Prediction Medium Range Forecasting Model (NCEP MRF). The 
assimilation of a total number of 1.5 x 10^ observations (including wind, temperature, and 
surface pressure observations) ai k' = k = 39 and 2/ + 1 = 9 takes about 6 minutes CPU time 
on 40 2.8 GHz Xeon processors. 
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Appendix A: Global continuity of matrix square roots 

Not all matrix square root definitions yield global continuity. One particular important mech- 
anism for non-global-continuity of matrix square roots is that the eigenvectors of a globally 
continuous, symmetric, non-negative matrix, M(r), may not be definable in a globally contin- 
uous manner. In particular, for smooth variation of M(r) in two dimensions, it can be shown 
that there will generically be isolated points in space where two of the eigenvalues of M(r) 
are equal. Following previous terminology in the field of quantum chaos (e.g., Ott 2002), we 
call such points "diabolical points" (e.g.. Berry 1983). Assume that two eigenvalues of M(r) 
denoted ^i(r) and ^2(1"), are equal at the diabolical point r = r^, and denote their associ- 
ated orthonormal eigenvectors by vi(r) and V2(r). Now consider starting at a point Fq 7^ 
and following a continuous path C that encircles and returns to Tq. Then it is shown 
in the paragraph below that, with continuous variation of Vi(r) and V2(r) along the path, 
their directions are flipped by 180° upon return to Tq. This presents no contradiction, since 
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orthonormal eigenvectors are arbitrary up to a change of sign, but it shows that a specific 
choice of vi(r) and V2(r) cannot be defined in a globally continuous manner. The positive 

1 /2 

symmetric square root (M(r)) ' , 

(M(r))^/^ = ^er(r)v,(r)vJ(r), 

i 

is globally continuous because Vj(r)vJ(r) returns to itself upon circuit around a diabolical 
point, even though Vj(r) may fiip by 180°. Thus the solutions for Z^n given in (jSTj), 
and ()66|) will be globally continuous, since positive symmetric square roots are used. The 
Cholesky square root will also yield global continuity. On the other hand, as an example of 
one of the choices that is unsatisfactory, the matrix square root choice, 

yMW = (M(r))^/^[vi(r)|v2(r)|..-r 

is clearly not globally continuous if diabolical points are present. 

In order to see how the above discussed property of diabolical points arises, consider the 
case of a two dimensional matrix, 



A 



The two eigenvalues of this matrix are 




(90) 



+ y(r) 



1/2 



(91) 



The eigenvalues are equal when the square root is zero, i.e., when a{v) = (3{r) and 7(r) = 0. 
These equations represent curves in the two-dimensional r-space, and then, as illustrated in 
Figure ITTk . equal eigenvalues (,^i = ^2) generically occur at points of intersection of these 
curves (e.g. the point shown in the figure). It suffices to consider the neighborhood of 
such a diabolical point, and to use deformed coordinates in which 7 and (a — /?)/2 serve as 
axes (Figure ITTb). Assume that we circuit around the origin of this system on the circular 
path C of radius p, 1— >2— i>3— >-4— ^5 shown in Figure ITTb. Letting {a — P)/2 = pcos-d, 
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7 = psirn!}, this path corresponds to continuous increase of ^9 from to 27r. On this path one 
can show that 

/ sin'i9/2 \ f cosi9/2 \ 

A-6I = 2p ' ]. (92) 

y cos^9/2 J \ -cos'd/2 ) 

Thus the normahzed eigenvector corresponding to is vi = (cos'i9/2, sin'i9/2). As shown 
in Figure ITTb . continuous increase of d from point 1 to point 5 resuhs in a 180° flip in the 
orientation of Vi. Thus vi (and similarly V2) cannot be defined in a single- valued globally 
continuous manner. 



Appendix B: XJ^^ obtained directly from 'Ymn 

In this Appendix we show that Y^n as given by (jHK|l can be used to directly obtain the 
analysis = (A;')~^/^{5xmn | 5xmn | ■ ■ ■ | ^Xmn^"*^"*}- In section 4.4, we discussed a variety 
of ways to compute a matrix Y^n to use in (jl^ to obtain, via (jHSj) . the analysis components, 
^Xmn*'"'' = Qmn^Xmn, in the low dimeusioual subspace S^n- We now claim that p5|l . p6| . 
(03), and dHSI) imply that 

(The crucial difference between (jUHj) and is the absence of the superscribed circumflexes 
in fl93|) ) . Then in practice the columns of can be used directly in (jHH). First we note 
that premultiplication of by returns Then further premultiplication by Qmn, 

together with (j^D), yields A^^iXmn = QmnXmL This means that the projection of onto 
Smn agrees with We verify (jHSI) by showing that its projection into the complementary 

spaces §mn and S^n agree with the decomposition (jH^I) . It remains to show that the projection 
of onto Smn agrees with (jH^ . Operating on both sides of with Amn and using 
X^„ = Xj^„Qm„ in (jEHI), we have 

a{-L)x« = a(-^)X^ X^^ O iP^ Y^iT^ - I'lX'' +A(-^)X^ f94'l 

^^mn-'^mn ^^mn-'^mn-'^m'ny<>rnn\^ mn) \^mn ^J-'^ran ~ ^^nin-'^mn- \-^^) 
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Now we recall from section 2 that S^n. and S^n are constructed from spanning vectors that 
are eigenvectors of P^n- Thus E>mn and E>mn are invariant under Pj^„. Since = X^^X^„ 
(see equation [TUj) . we have that X^„X^ commutes with the projection operators Amn and 



Amn- Thus 



x'''^o - x^ x*^a(^)o -0 rq^^i 



where the second equality follows because QmnW is in for any /c- dimensional column 
vector w, thus yielding A^i"2Qm„ = 0. From (jUH) and (jHSI) we have A^2X5=^„ = A^2x^„ or 
(Jxmri^''"'' = 5yim}}'^\ as required by (jHUj) . This establishes (jUHj) . We find that use of can 
be potentially advantageous for efficient parallel implementation of our method. We plan to 
further discuss this in a future publication applying our local ensemble Kalman filter to the 
operational global model of the National Centers for Environmental Prediction. 



Appendix C: Enhanced Variance Inflation 

In section 4.6 we mentioned the modification of PJ^n or Pmn ^o prevent the occurrence of 
small eigenvalues in these matrices. Furthermore, we noted the possibility of an accompanying 
modification of the corresponding ensemble perturbations, so as to preserve the relation, 

^ k'+l 

Pmn = ^ ] ^^mni^'^mn) ■ (96) 
i=l 

In the above equation we have suppressed the superscript a or 6 with the understanding that 
(inni) can apply to either the analysis or background. 

We consider the example where Pmn is changed to a new covariance matrix by addition 
of a small perturbation in the form, 

= Pmn + ^Ifc, e > 0, (97) 

where 1^ denotes the k x k unit matrix, and A is the trace of Pmn! i-e., it is the sum of 
its eigenvalues, and thus represents the total variance of the ensemble. (The case = 2 is 
illustrated in Figure ITUl) Hence ()97|) increases the total variance by the factor (1 +e), where 
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we regard e as small, 1 ^ £ > 0. More importantly, for small e, the additional variance 
represented in (j^Tfj) results in a relatively small change in the largest eigenvalues of Pmn, but 
prevents any eigenvalue from dropping below {eA/k), thus effectively providing a floor on the 
variance in any eigendirection. Having modified Pmn to via (jUTj) . we now consider the 
modification of the ensemble perturbations, 5xmn, to another set of ensemble perturbations, 
Sx.mn, with the perturbed covariance, 

k'+l 



P*mn = l;J2^^^^niS^^r^nf. (98) 

i=l 



We use the result of sections 4.2 and 4.4 to choose the 5xmn to minimize the difference with 
6x.mn- This result is the same for all metrics Dmn that commute with Pmn (equivalently Pmn)- 
(Note that the solutions in (jUT|) are all the same if Dmn, Pmn Pmn commute.) Adopting 
this solution for 5xmn, we introduce the orthogonal eigenvectors of Pmn, which we denote 
Wmn- The result for 6x.mn is then 



^^mn ~ '^mn^^rnn, (99) 

where 

k 

Z* = V f (^'^ w(^) (w(^) f (100) 

mn / ^ ^mn mn\ mnJ \ 

with 



^i?i = ^Jl + {eA/kvli>n), (101) 

and ijmn is the eigenvalue of Pmn corresponding to Wml; that is, PmnWmn = r]mnWmn- 

Recalling that P[^„ is diagonal (see (jSHI)), we see that in the case Pmn = P\nn (which is 
employed in section 5) the ith component of the vector is 5ij. Consequently, for this 
case, ()100|) and ()101|) imply that ZJ„^ is diagonal, 

Zl,^ = diag{ii,i2,---,ik). (102) 

In the case Pmn = Pmn, could combine variance inflation and a procedure for obtaining 
the analysis ensemble {5xmn} (e.g., solutions 1, 2,, or 3 of section 4.2): First inflate Pmn, 

-pa* _ -pa _j_ f^a 
mn mn ~^ ^mn, 

41 



where G^^ is any chosen inflation; and, second, replace by in the chosen algorithm 
for determining the analysis ensemble. 
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Table Captions 



Table 1. Dependence of the time mean rms error on the box size {21 + 1) and the rank (k) 
of the background covariance matrix. The symbol D stands for time mean rms errors larger 
than one, which is the rms mean of the observational errors. The coefficient of the enhanced 
variance inflation is e = 0.012. 

Table 2. Dependence of the time mean rms error on the coefficient {e) of the enhanced 
variance inflation scheme and the rank [k) of the background covariance matrix. The meaning 
of D is the same as in Table 1. The window size is 13. 

Table 3. Dependence of the rms analysis error on A in the regular variance inflation scheme 
and the rank {k) of the background error covariance matrix. The meaning of D is the same 
as in Table 1. The window size is 13. 

Table 4. Same as Table 2 except that Solution 1 and 5xm*^*'^'* = is used (instead of IHSl) to 
obtain the analysis ensemble. 

Table 5. Same as Table 3 except that Solution 1 and 6x.m^^^^^ = is used (instead of 1221) 
obtain the analysis ensemble. 
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Figure Captions 



New Figure 1. 

Illustration of the Local Ensemble Kalman Filter scheme as given by the six steps hsted 
in the introduction. The symbols in the figure are as follows: 

• x"('^(r, t) = the analysis ensemble fields as a function of position r on the globe at time 
t. 

• x.mn{r, t) = ensemble of background atmospheric state in local region mn. 

• ^mn — the local low dimensional subspace in region mn. 

• x^„(i) = the mean analysis state in S^„. 

• P^„(i) = the analysis error covariance matrix in 

Figure 1. Probability ellipsoid for x^„. 

Figure 2. Illustration of the local region {I — 5) with a central region {I' — 2). 

Figure 3. The rms error of the full Kalman filter as function of the number of ensemble 
members. Shown are the results for M — AO (sohd fine), M — SO (dashed fine), and M — 120 
(dotted-dashed fine). 

Figure 4. The rms error of the local ensemble Kalman filter as function of the number of 
ensemble members. Shown are the results for M — AO (sohd fine), M — SO (dashed fine), and 
M = 120 (dotted-dashed line). 

Figure 5. The rms error of the different analysis schemes as function of the number of 
observations. Shown are the results for the full Kalman filter [4% variance infiation] (dashed 
line), conventional scheme (dashed-dotted line), direct insertion (solid line with diamonds), 
and the local ensemble Kalman filter [3% variance infiation] (solid fine) . 
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Figure 6. The ratio at m = 1 as function of j for two different values of e and A. 

Figure 7. Projection of the true background error, b^, on the main axes of the probability 
ellipsoid. 

Figure 8. The long time evolution of the rms analysis error for a set of parameters that 
allow spikes to occur. 

Figure 9. The effect of the enhanced variance inflation (equation Wl\ on the probability 
elhpsoid. For the case P^n = P^n; V^i = Xml. and t]1^1 = A^^L 

Figure 10. (a) The curves 7(r) = and a(r) = f3{r) cross at a diabolical point r^. (b) The 
eigenvector vi flips by 180° on one circuit around the diabolical point. 
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k 

21 + 1 


3 


4 


5 


6 


7 


8 


9 


5 


0.24 


0.23 












7 


0.22 


0.22 


0.21 


0.22 








9 


0.22 


0.21 


0.21 


0.21 


0.21 


0.21 




11 


D 


D 


0.20 


0.20 


0.20 


0.20 


0.20 


13 


D 


D 


0.20 


0.20 


0.20 


0.20 


0.20 


15 


D 


D 


D 


0.22 


0.20 


0.2 


0.20 



Table 1: Dependence of the time mean rms error on the box size {21 + 1) and the rank (k) 
of the background covariance matrix. The symbol D stands for time mean rms errors larger 
than one, which is the rms mean of the observational errors. The coefficient of the enhanced 
variance inflation is e = 0.012. 
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k 

e 


4 


5 


6 


7 


8 


9 


0.008 


D 


D 


0.44 


0.20 


0.20 


0.20 


0.010 


D 


D 


0.2 


0.2 


0.20 


0.20 


0.012 


D 


0.20 


0.20 


0.20 


0.20 


0.20 


0.014 


D 


0.20 


0.20 


0.20 


0.20 


0.20 


0.016 


D 


0.20 


0.20 


0.20 


0.20 


0.20 


0.018 


D 


0.20 


0.20 


0.20 


0.20 


0.20 


0.020 


0.21 


0.20 


0.20 


0.20 


0.20 


0.20 



Table 2: Dependence of the time mean rms error on the coefficient (e) of the enhanced variance 
inflation scheme and the rank (k) of the background covariance matrix. The meaning of D is 
the same as in Table 1. The window size is 13. 
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k 

A 


4 


5 


6 


7 


8 


9 


0.020 


D 


D 


0.50 


0.30 


D 


D 


0.024 


D 


0.87 


0.42 


0.21 


0.21 


0.21 


0.028 


0.21 


0.36 


0.20 


0.20 


0.20 


0.20 


0.032 


0.20 


0.20 


0.20 


0.20 


0.20 


0.20 


0.036 


0.20 


0.20 


0.20 


0.20 


0.29 


0.20 


0.040 


0.20 


0.20 


0.20 


0.20 


0.20 


0.20 


0.044 


0.20 


0.20 


0.20 


0.20 


0.20 


0.20 


0.048 


0.20 


0.20 


0.20 


0.20 


0.20 


0.20 



Table 3: Dependence of the rms analysis error on A in the regular variance inflation scheme 
and the rank (k) of the background error covariance matrix. The meaning of D is the same 
as in Table 1. The window size is 13. 
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k 


4 


5 


6 


7 


8 


9 


inflation coefficient e 














0.010 


D 


D 


D 


0.41 


0.20 


0.20 


0.012 


D 


D 


D 


0.27 


0.20 


0.20 


0.014 


D 


D 


D 


0.21 


0.20 


0.20 


0.016 


D 


D 


D 


0.21 


0.20 


0.20 


0.018 


D 


D 


0.46 


0.21 


0.20 


0.20 


0.020 


D 


D 


0.28 


0.21 


0.20 


0.20 


0.022 


D 


D 


0.23 


0.21 


0.20 


0.21 


0.024 


D 


D 


0.22 


0.21 


0.21 


0.21 



Table 4: Same as Table 2 except that Solution 1 and 5xm = is used (instead of ESI) 
obtain the analysis ensemble. 
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k 


4 


5 


6 


7 


8 


9 


A 














0.020 


D 


D 


D 


D 


D 


D 


0.024 


D 


D 


D 


D 


0.75 


0.21 


0.028 


D 


D 


D 


D 


0.22 


0.21 


0.032 


D 


D 


D 


D 


D 


0.20 


0.036 


D 


D 


D 


D 


0.21 


0.25 


0.040 


D 


D 


D 


0.22 


0.21 


0.20 


0.044 


D 


D 


D 


D 


0.20 


0.20 


0.048 


D 


D 


D 


0.25 


0.20 


0.20 



Table 5: Same as Table 3 except that Solution 1 and 5xm = is used (instead of ESI) 
obtain the analysis ensemble. 
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Obtain global 
ensemble 
analysis fields 
(step 5) 





Evolve model 
from t — Ato t 
(step 1) 



p« (f) (f) 

mn\ Ji mn\ J 

(step 4) 



x^W(r,t) 

Form local 
vectors 
(step 2) 




The symbols in the figure are as follows: 



"mn 



Project to 
and do analysis 
(step 3) 



• x"*^*''(r,t) = the analysis ensemble fields as a function of position r on the globe at time t. 

• XTOn(r,t) — ensemble of background atmospheric state in local region mn. 

• = the local low dimensional subspace in region mn. 

• x^„(t) = the mean analysis state in S^„. 

• P^„(i) = the analysis error covariance matrix in 

Figure 1: Illustration of the Local Ensemble Kalman Filter scheme as given by the six steps 
listed in the introduction. 
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(f) 

mn\ J 



Figure 2: Probability ellipsoid for x! 



6 

mn" 
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2/ + 1 



Figure 3: Illustration of the local region {I — 5) with a central region {I' — 2). 



56 




100 

ENSEMBLE SIZE 



200 



Figure 4: The rms error of the full Kalman filter as function of the number of ensemble 
members. Shown are the results for M = 40 (solid line), M = 80 (dashed line), and M = 120 
(dotted-dashed line). 
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Figure 5: The rms error of the local ensemble Kalman filter as function of the number of 

ensemble members. Shown are the results for M = 40 (solid line), M = 80 (dashed line), and 
M = 120 (dotted-dashed line). 
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NUMBER OF OBSERVATIONS 



Figure 6: The rms error of the different analysis schemes as function of the number of obser- 
vations. Shown are the results for the full Kalman filter [4% variance inflation] (dashed line), 
conventional scheme (dashed-dotted line) , direct insertion (solid line with diamonds) , and the 
local ensemble Kalman filter [3% variance inflation] (solid hne). 



59 



Enhanced variance inflation with 8=0.015 / ^ 

Enhanced variance inflation with 8=0.01 / / 

Regular variance inflation with A=0.03 / / 
Regular variance inflation with A=0.04 / / 



. 1 > ' 



1 2 3 4 5 6 7 8 

Eigenvalue number j 

Figure 7: The ratio at m = 1 as function of j for two different values of e and A. 
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Figure 8: Projection of the true background error, b^^ on the main axes of the probabihty 
eUipsoid. 
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0.5 1 1.5 2 2.5 3 3.5 4 4.5 5 



Time xio' 

Figure 9: The long time evolution of the rms analysis error for a set of parameters that allow 
spikes to occur. 
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probability ellipsoid 
before inflation 
(associated with Pmn) 




Figure 10: The effect of the enhanced variance infiation (equation PTjl on the probabihty 
elhpsoid. For the special case Pmn = P^n) Vml = Amn and rjml. = Xmn- 
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(a) 7 (b) 




Figure 11: (a) The curves 7(r) =0 and a{r) = /3(r) cross at a diabolical point r^. (b) The 
eigenvector vi flips by 180° on one circuit around the diabolical point. 
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